Revisiting “Forward node-selecting queries over trees”

Author

  • James Cheney

Abstract

XML is a World Wide Web Consortium (W3C) standard for tree-structured data. XPath [Clark and DeRose 1999] is an important language widely employed in XML query, transformation, and update languages. XPath is a language of path expressions that can be viewed as defining sets of nodes of a tree, by following axis steps and applying node tests or path-existence filters to navigate from the root of the tree. For example, the XPath expression /descendant::A[child::B] selects all nodes in the tree below the root whose label is A and that have a child labeled B; here, descendant and child are axis steps, and the brackets indicate a filter that tests for the existence of a path matching the expression inside. A significant complication in XPath is the presence of both forward and reverse axis steps. If implemented naively, for example by repeatedly traversing the tree in forward or backward directions, queries that mix forward and reverse edges can be very expensive to evaluate. For example, an XPath query...
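To make the reading of /descendant::A[child::B] given in the abstract concrete, here is a minimal Python sketch. It is not the paper's algorithm; the `Node` class, the `#document` root label, and the function names are hypothetical and exist only for illustration. It takes the nodes strictly below the root, keeps those labeled A, and applies the filter by checking for a child labeled B.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A labeled tree node (hypothetical type, for illustration only)."""
    label: str
    children: list = field(default_factory=list)


def descendants(node):
    """Yield every node strictly below `node`, in document order."""
    for child in node.children:
        yield child
        yield from descendants(child)


def select_a_with_b_child(root):
    """Naive evaluation of /descendant::A[child::B] from `root`."""
    return [
        n
        for n in descendants(root)                   # descendant axis step
        if n.label == "A"                            # node test A
        and any(c.label == "B" for c in n.children)  # filter [child::B]
    ]


# Assumed example tree: a document root above three A nodes.
tree = Node("#document", [
    Node("A", [
        Node("B"),
        Node("A", [Node("C")]),  # no B child, so it is filtered out
        Node("A", [Node("B")]),
    ]),
])

selected = select_a_with_b_child(tree)
```

This query uses forward axes only, so a single top-down traversal suffices; the cost concern raised in the abstract applies to queries that also use reverse axes.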


Similar resources

Efficient Processing of Expressive Node-Selecting Queries on XML Data in Secondary Storage: A Tree Automata-based Approach

We propose a new, highly scalable and efficient technique for evaluating node-selecting queries on XML trees which is based on recent advances in the theory of tree automata. Our query processing techniques require only two linear passes over the XML data on disk, and their main memory requirements are in principle independent of the size of the data. The overall running time is O(m + n), where...


Learning Node Selecting Tree Transducer from Completely Annotated Examples

A base problem in Web information extraction is to find appropriate queries for informative nodes in trees. We propose to learn queries for nodes in trees automatically from examples. We introduce node selecting tree transducer (NSTT) and show how to induce deterministic NSTTs in polynomial time from completely annotated examples. We have implemented learning algorithms for NSTTs, started apply...


Schema-Guided Induction of Monadic Queries

The induction of monadic node selecting queries from partially annotated XML-trees is a key task in Web information extraction. We show how to integrate schema guidance into an RPNI-based learning algorithm, in which monadic queries are represented by pruning node selecting tree transducers. We present experimental results on schema guidance by the DTD of HTML.


Learning Monadic Queries for Semi-Structured Documents from Positive Examples

Querying for nodes in trees is a core operation for information extraction from semi-structured documents in XML or HTML. We show that regular monadic queries for nodes in trees can be identified from positive examples, and this in polynomial time when represented by deterministic node selecting transducers that we introduce.


Learning n-Ary Node Selecting Tree Transducers from Completely Annotated Examples

We present the first algorithm for learning n-ary node selection queries in trees from completely annotated examples by methods of grammatical inference. We propose to represent n-ary queries by deterministic n-ary node selecting tree transducers (n-NSTTs). These are tree automata that capture the class of monadic second-order definable nary queries. We show that n-NSTT defined polynomially bou...




Publication date: 2013